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Orbit-based dynamical models of the Sculptor dSph galaxy 
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ABSTRACT 

Wc have developed spherically symmetric dynamical models of dwarf spheroidal galax- 
ies using Schwarzschild's orbit superposition method. This type of modelling yields 
constraints both on the total mass distribution (e.g. enclosed mass and scale radius) 
as well as on the orbital structure of the system (e.g. velocity anisotropy). This method 
is thus less prone to biases introduced by assumptions in comparison to the more com- 
monly used Jeans modelling, and it allows us to derive the dark matter content in a 
robust way. Here we present our results for the Sculptor dwarf spheroidal galaxy, after 
testing our methods on mock data sets. We fit both the second and fourth veloc- 
ity moment profile to break the mass-anisotropy degeneracy. We find that the mass 
of Sculptor within 1 kpc is Mi kpc = (1.03 ± 0.07) x 10 8 M Q , and that its velocity 
anisotropy profile is tangcntially biased and nearly constant with radius. For an NFW 
dark matter profile, the preferred concentration (c ~ 15) is low for its dark mat- 
ter mass but consistent within the scatter found in N-body cosmological simulations. 
Very cuspy density profiles with logarithmic central slopes a < —1.5 are strongly dis- 
favoured for Sculptor. However, a firm distinction between a central core (a = 0) or a 
shallower cusp (a — 1) cannot be made. 
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1 INTRODUCTION 

The existence of dark matter has been invoked to explain 
discrepancies in the observed kinematics of (systems of) 
galaxies. Especially in the last 30 years it has become a key 
ingredient of our current cosmological model, the A cold 
dark matter paradigm (hereafter ACDM). N-body simula- 
tions have made clear predictions on how dark matter should 
be distributed in the Universe. Navarro, Frenk & White 
(1996) showed that simulated dark halos have a universal in- 
ternal density distribution, now known as the NFW profile. 
Although there have been some revisions, the general form 
has remained, and the inner regions of simulated dark halos 
are found to b e cusped with logarit hmic slopes in the range 
-1.2 to -0.75 (|Navarro et alj|20ld h CDM simulations have 
also revealed the existence of a universal spin distribution 
and of relations between the characteristic p arameters of a 
dark halo such as concentration and mass fe.g lBullock et all 
l200ll h 
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The predictions of the ACDM model may be tested us- 
ing kinematic data. Cleaner tests are generally obtained us- 
ing tracers located at large distances, i.e . in the regions that 
are dominated by the dark matter (e.g Romanowskv et all 
120031 : iBattaglia et~al]|2005l . 120061 : IXue et alj|2008l h In these 
examples, a relatively accurate measurement of the mass 
contained within a given radius can be obtained, but con- 
straints on the density profile depend on good knowledge of 
the spatial distribution of the tracers, which may be some- 
what uncertain. Another possibility is to use galaxies that 
are dark matter do minated at all radii, such as low surface 
brightness systems ((de Blok 2 id 1 ). 

An example of the latter class are the d warf spheroida l 
(dSph) galaxies satellites of the Milky Way (|Mated Il998h . 
These appear to be the most dark matter dominated galax- 
ies with total dynamical mass to stellar light ratios in the 
order of 100-1000 Mq/L© d erived under the assumption of 
dynamical equilibrium (e.g. IWolf et "all l20ld h The nearby 
dSph galaxies have the additional advantage that individ- 
ual stars can be resolved, and their red giant branch (RGB) 
stars are bright enough to measure line-of-sight velocities 
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with errors of a few km s _1 (|Mateo et al.lll99lh . The dy- 
namical modelling of these objects is relatively simple since 
they are rather round, pressure supported and show little 
or no rotation. Their high dynamical mass-to-light ratios 
makes these systems ideal to study dark matter halos, es- 
pecially their internal structure and to constrain their inner 
density profiles. 

Most of the Milky Way dSph satellites ha ve been mod- 
elled using the spherical Jeans equations (e. g. Klcvna ~et al.l 
200ll:lBattag]ia et al.ll200Sl ; IStrigaxi et al]|2008j ; lLokadl2009l ; 
Walker et al.ll2009l ), while for moie distant objects, such as 
the dSph satellites of M31, masses have been derived from 
the average velocity dispersion and project ed mass estima- 
tors (jKalirai et al.H2010l : [Collins et alj|2010h . In Jeans mod- 
els one has to specify (i) the form of the light distribution, 
(ii) the density profile (or equivalently the gravitational po- 
tential) of the dark matter component, and (iii) velocity 
anisotropy of the stars. These characterise a given Jeans 
model, from which the second velocity moment projected 
along the line-of-sight can be computed. This is then com- 
pared to the measured line-of-sight velocity dispersion of the 
stars at different locations across the galaxy to establish the 
performance and characteristics parameters of the specific 
model. 

Jeans modelling suffers from a number of limitations. 
Firstly the functional form of the velocity anisotropy has to 
be specified a priori while it is generally unknown. This is 
because precise measurements of the proper motions of stars 
in dSph are well beyond reach with current instrumentation. 
Also inherent to the method is the comparison between the 
moments of the model to those of the data which requires 
binning of the data and generally implies loss of informa- 
tion. It is also important to note that there is no guaran- 
tee that the resulting distribution function is non-negative 
everywhere, a requirement for it to be physical. Nonethe- 
less, there have been interesting discoveries based the use 
of the Jeans equations and which are robust to assumptions 
of the underlying anisotropy profile. These include for ex- 
ample, the existence o f a possible commo n mass scale of 
dwarf spheroidals (e.g. IStrigari et al.l [20081 ) . and the tight 
constraints on the total mass within the half-light r adius of 
these systems (IWalker et al.ll20ld ; IWolf et alj|2010h . 

Recently, lAn fc Evans! (2009) demonstrated that if the 
tracer population is supported by a spherical dark halo 
with a core or a cusp (less steep than a singular isother- 
mal sphere) , then the logarithmic slope 7 of the light profile 
and the velocity anisotropy /3 are related as 7 = 2/3 at the 
centre. This is valid if oy(0) > 0, i.e. only if the stars are not 
dynamically cold in this region. This would imply that the 
existence of a cusp or core at the centre is a consequence of 
the assumptions alone, if just the second velocity moment 
is modelled using Jeans equations. However, this result has 
been derived under the hypothesis that the anisotropy pro- 
file varies much more slowly than the stellar density profile 
and the gravitational potential at small radii. This hypoth- 
esis is rather restrictive since for example, the star number 
counts (or light profiles) of dSph are generally well-fit by 
cored distributions, and would also break down if the dark 
matter halo had an extended core. Therefore, for this the- 
orem (and implications) to be valid, j3 would have to be 
strictly constant near r ~ 0. 

The above discussions shows clearly that there is a need 



to go beyond the modelling of the secon d moment using 
Jeans equations. For example, lLokasI (|2002T ) proposed to use 
higher moments to constrain the internal dynamics of dSphs 
since the kurtosis profile depends mostly on anisotropy while 
the velocity dispers ion depends both on mass and anisotropy 
lLokas et alj (|2005h . hence this lifts some of the degenera- 
cies. Other possibilities would be to use par ametrised phase- 
space d istribution functions as pioneered by I Wilkinson et all 
(|2002al ) ( see also lAmorisco fc Evans! l201ll ~ or the made- 



to-measure technique ( Sver fc Tremainelll996l ; lLong fc Mad 
l20ld ). 

In this paper we ta ke a different appr oach and use 
Schwarzschild modelling (|SchwarzschiId|[l979f ) to probe the 
internal dynamics and characterise the dark matter content 
of the Sculptor dwarf spheroidal galaxy. The basic steps 
of the Schwarzschild method are to integrate a set of or- 
bits in a given potential, calculate the predicted observ- 
ables for each orbit, and then to weigh the orbits (with 
non-negative weights) to obtain a model that fits the ob- 
served data well in a \ 2 sense. This approach guaran- 
tees that the distribution function (which is reflected in 
the orbit weigh ts) is non-negative. T his method was orig- 
inally used by ISchwarzschildl (|l979l ) to prove that a self 
consistent solution in dynamic equilibrium exists for a tri- 
axial system, but was only implemented to reproduce the 
density distribution. The me thod was later extended to 
include kinemat ic constrains (|Richstone fc Tremainei [l984l ; 
IPfennigerl Il984h. Since then many code s have been de- 
veloped (e.g. iRichstone fc Tremaine] Il984l; iRix et al.lll997l; 
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2004 Ivan den Bosch et al]|2008l ). While first only the low 



est moments of the line of sight velocity distribution (mean 
velocity and velocity dispersion) were fitted, better data 
have led to the inclusion of higher moments in the fits. 
While the use of moments allows one to use linear or 
quadratic programming to find the orbit weights, also 
likelihood methods using discrete data have been devel- 
oped (e.g.lMerritt fc Tremblavll 19931 ; IWu fc Tremainell2006l ; 
Ich aname et al.l 120081 ) . A great advantage of Schwarzschild 
modelling is that it does not require the specification of the 
anisotro py profile, this is in fact an outcome of the model 
(see also lJardel fc Gebhardtll2012l , for an application on the 
Fornax dwarf galaxy) . 

Sculptor (Scl) is a dwarf spheroidal galaxy satellite of 
the Milky Way. It lies at high galactic latitude and is lo- 
cated at a heliocentric distance of 79 kpc. With an elliptic- 
ity of 0.32 (axis ratio is 0.68) it is not extremely flattened 
dlrwin fc Hatz idimitrioul 119951 ), allowing us to approximate 
and model Sculptor as a spherical object. Its luminosity is 
Lv = 2.15 x lO 6 I/0 and the latest est imate of its dynamica l 
mass is 2-3 x 1O 8 M within 1.8 kpc (|Battaglia et al.ll2008l ). 
Its (stellar) mass distribution can be well fitted with a Plum- 
mer profile wit h scale radius b — 13.0 arcmin (~ 0.3 kpc, 
iBattaglial 12007). Two large k i nemat ic data s ets have been 
compi led by iBattaglia et all (|2008l 'l and by IWalker et al.l 
l|2009l ). leading to a total ~ 2000 member stars with ra- 
dial velocity measurements with errors of ~ 2 km/s. As we 
show below, the combination of these two data sets together 
with the Schwarzschild method allows us to constrain the 
dark matter distribution of Sculptor and its internal orbital 
structure. 

This paper is organised as follows. In [J2] we will describe 
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the basic ingredients of Schwarzschild modelling, especially 
focusing on how it can be applied to dSph data. In |3]we 
validate our model on a mock data set motivated by the 
current Sculptor data. In [|4]we apply the technique to Scl 
data, and we present our conclusions in §j5j 



2 DYNAMICAL MODEL 

In this section we review some of the theory that provides 
the basis for our Schwarzschild method. We then describe 
how to generate models and focus later on how these can be 
fit to the observables. 



2.1 Generalities 

The phase-space structure of a galaxy can be specified by its 
distribution function (hereafter df) f(x,v), where x and v 
are the position and velocity coordinates respectively. The 
probability of finding a star in the volume dxdv is given by 
f(x, v)dxdv. All observables may be derived from knowl- 
edge of the df. For example the normalised surface density: 

fj.{x,y) = J dzdvf(x,v), (1) 

where z is the direction along the line-of-si ght. 

According to the (strong) Ijeansl (|l915l ) theorem, the df 
of a steady-state stellar system in which almost all orbits 
are regul ar, is a function of the isol ating integrals of motion 
(see also iBinnev fc Tremainel I2008T ) . Spherically symmetric 
systems (both in the tracer's density and the underlying 
potential) have only regular orbits and generally respect 4 
integrals of motion, the energy and the 3 components of the 
angular momentum vector. However, if the galaxy shows no 
rotation, due to symmetry, the df will depend only on the 
energy and the length of the angular momentum vector, i.e. 
f(x,v) — f(E,L). Furthermore if the local velocity distri- 
bution is isotropic, the df can only depend on energy and 
f(x,v) = f(E). 

Most dSph galaxies are so distant that the only phase- 
space coordinates that may be measured currently are the 
projected stellar positions on the sky, and the line-of-sight 
velocities of (a subset of) its stars. These can be used to 
derive the surface density fj,o{R) and the moments of the 
line-of-sight velocity distribution: 

Ho(R) = J dzdvf(E,L), (2) 

^( R ) = ~ Tm / dzdvvff(E,L), (3) 
mo(-R) J 

m(R) = f dzdvv\f{E,L). (4) 

Here R is the projected distance on the sky from the centre 
of the galaxy and i)m the velocity along the line-of-sight, after 
subtraction of the centre of mass mean motion. 

The above equations suggest that through comparison 
to the observables it should be possible to derive the form of 
the df. In some cases, it may be better to parametrise the df 
and try to estim ate its characteristic parameters by compar- 
ison to the data (jWilkinson et al . 20023; [A morisco fc Evansl 



l201ll) . However, in this work we prefer to use a non- 
parametric approach such as the Schwarzschild method. 
This method uses orbits integrated in a specific gravitational 
potential as building blocks. From these, light and kinemat- 
ical profiles may be derived and compared to observations 
through appropriate weighing of the orbits. 

In the case of a dwarf galaxy embedded in a spherical 
dark matter halo, the gravitational potential can be charac- 
terised by a few parameters such as: i) the (enclosed) mass of 
the dark matter halo Mum, and ii) its scale parameter tdm- 
Due to the high dynamical mass-to-light ratios of dSphs, we 
do not expect the stellar mass to have a significant influence 
on the dynamics of the galaxy. We a ssume a fixed stella r 
mass-to-light ratio of Mq /Lq = 1 as in IWalker et af] (|2007h . 
and hence from the light distribution we may directly derive 
the gravitational potential associated to the stars. In the re- 
mainder of the paper we shall refer to properties related to 
the stellar mass and luminosity interchangeably. 

Thus in practise, for a given set of parameters of the 
potential, we integrate orbits and match these to the ob- 
servations by adjusting the orbital weights. We then repeat 
this exercise for other values of these parameters. This can 
be used to establish the values of the set of parameters which 
result in a better fit to the observables. 



2.2 From the model to the observables 

O ur Schwarz s child meth od is based on many of the ideas 
of iRix et all (|l997l ) and Ivan den Bosch et all (120081 1. It is 
however, a new implementation that is optimised for spher- 
ical symmetry. Among other small improvements, our code 
can be run in parallel and is therefore significantly faster; 
furthermore for each orbit, we do not store the full line-of- 
sight velocity distribution but only its moments, which also 
reduces the computational load. 

We now focus on how to generate the observables, 
namely the surface density and moments of the line-of-sight 
velocity distribution of the models and how to compare these 
to data. 

For convenience we define / = I//I/ max the relative an- 
gular momentum (where L max is the angular momentum 
of a circular orbit of energy E), such that I £ [0, 1]. This 
enables us to define a rectangular grid in energy and rela- 
tive angular momentum. Since the Schwarzschild method is 
based on orbit integrations, the df may be seen as a sum of 
Dirac delta functions: 

f(E, L) = J2 f^ S ( E - E ^ L - hL^.i), (5) 

i,j 

where fi,i = 1 and fi,j > 0. 

To define the grid in energy and (relative) angular mo- 
mentum we proceed as follows. For the energy we choose N' E 
radii between a minimum and maximum radius spaced log- 
arithmically, and take the corresponding energy of a purely 
radial orbit. The minimum and maximum radii we consider 
are 0.033 kpc and 24.492 kpc, respectively. For each energy 
we choose N( relative angular momenta spaced linearly be- 
tween and 1. All orbits are integrated starting from their 
apocentre. 

We also define Nr radial bins on the sky, defined by 
radii at the edges Rk (k = 0...Nr). The borders are deter- 
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mined by the kinematic data, by requiring for instance that 
each bin contains a particular number of stars. 

In general, it is convenient to work with the (nor- 
malised) mass in a given radial bin: 

dm*(R) 



M, 



2nR^ (R)dR. 



(6) 



Thus the mass contributed by orbit of energy Ei and relative 
angular momentum lj in the radial bin k is: 



Am,. 



M, 



2nRno,i,j(R)dR. 



(7) 



In the Schwarzschild method this quantity is obtained 
by integrating the i, j orbit and calculating the fractional 
time this orbit spends in radial bin k. Since we integrate 
the orbit with a fixed time step, this is simply equivalent 
to counting the number of times the orbit crosses bin k, 
divided by the number of time steps. To reflect the spherical 
symmetry, at each time step the position and velocit i es are 
rotated randomly jV ro t = 25 times, as in iRix et alj j 19971 
Eq. 2). Each orbit is integrated for 100 orbital timescales 
iorb, with t or b = 2nr a ,/v c i TC , and where r a is the apocentre 
radius and u c i rc the circular velocity at r a . Each orbit is 
stored at 1000 points (separated by a constant time step). 
Therefore the total mass (contributed by all orbits) in bin k 



N' E N{ 

Am, fe \ - \ - , .s . . , Am*,i,j,k 

- 2_^2^ i g{Ei,L i )f i jL maK AE i A.l i x - 

«=1 3=1 



AI, 



M, 



i=i j=i 



(8) 

where g(E, L) is the density of states. The coefficients d it j 
are known as the orbital weights. 

We may now proceed to calculate the light-weighted 
second and fourth moments of the line-of-sight velocity dis- 
tribution in a given projected radial bin k as: 



M2,fc = 



M, 
Am.t 



M4,fe 



A'L 
Am„ fc 



EE* 

!=1 3=1 

EE4 

!=1 3 *=1 



2TrRfi 0tltj (R)fi 2 ,,, j (R)dR, (9) 

fit 



2nRno,i,j (R)iu,i,3 (R)dR, (10) 



where H2,i,j(R) and fJ<4,i,j(R) are the second and fourth mo- 
ment respectively of orbit i,j. The integral is also derived 
from the orbit integrations. However, instead of counting 
each time the orbit is found in bin k, we add the corre- 
sponding second moment in quadrature (and to the fourth 
power for the fourth moment) and at the end divide by the 
number of time steps. Note that the moments are linear in 
the orbital weights, which allows us to find a solution us- 
ing quadratic programming, while for instance the kurtosis 
(72 = M4/M2) is not. 

It is possible to consider the orbit weights (c^) as free 
parameters whose exact values will be determined through 
comparison to the observables. However this would imply 
that the number of orbits that are integrated to reproduce 
the observables is exactly equal to the number of free pa- 
rameters that define the df. Decoupling these two sets of 



quantities is clearly desirable, see e.g. ICretton et all (| 1999h ■ 
This procedure is known as dithering and results in smoother 
density distributions while keeping the number of free pa- 
rameters in the distribution function small. 

While we may use N' E x N[ orbits to reproduce the ob- 
servables, we choose only Ne X Ni = N' E x N' l /(N dE x N dl ) 
free parameters to characterise the distribution function, 
where we take Nd E x Nd t =8x8 = 64. The coefficients of 
the distribution function dj are related to the orbit weights 
{Ci j) as follows: 

where \ indicates the integer part, e.g. [i/Nd E \. Therefore 
Nd E x Nd t orbits share the same df coefficient. In practice, 
one can simply average the quantities obtained from the 
individual orbits. We choose Ne = 20 and Ni = 8, which 
results in 20 x 8 = 160 free parameters for the distribution 
function, but we integrate 20 x8x8x8 = 10250 orbits. 

To fit models to the data we generally use projected 
quantities (i.e. the observables). However, if one knows (or 
has derived) the df coefficients, it is also possible to make 
predictions for quantities that are not (yet) directly observ- 
able, such as the intrinsic (3d) density distribution or mo- 
ments of the full velocity distribution. For example, the mass 
contained in the (spherical) radial bin m contributed by or- 
bit i, j is 

A )TT<* ) 3d,i ,j ,m 



M* 



A-rt u t ,ij(r)dr, 



(12) 



where the integral is computed from the orbital integrations, 
and v*,i,j{r) is the radial density profile of orbit i, j. In prac- 
tise we use iV r = 50 (3d) radial bins, spaced linearly between 
f'min = kpc and r max = 1.5 kpc. Similarly we also store 
the radial and tangential velocity dispersions in these bins. 
Although we do not store the intrinsic properties beyond 1.5 
kpc, this has no effect on the way the projected properties 
are determined. Note that the intrinsic properties are not 
used in any of the fitting routines but may be used for in- 
ferring for instance the intrinsic velocity anisotropy profile. 

Orbits are integrated using the GNU Scientific Library 
(GSL) ordinary differential equation solver using an 8th or- 
der (Runge Kutta) Prince-Dormand method. We found that 
the energy is conserved to better than 0.1%. 

2.3 Fitting procedure 

2.3.1 Light distribution 

Our first requirement is for the model to fit the observed 
light distribution. We assume that this is known accurately. 
We require that the projected mass (or light) in each bin 
is matched within 1 per cent. Given our assumption of a 
constant stellar mass-to-light ratio, we make no distinction 
between surface brightness and stellar mass surface density 
in what follows. From the assumed brightness profile fj,*(R), 
we calculate: 

r-Rk + l 



Am, 



2nR^ t (r)dR, 



and thus require for each projected radial bin k that: 
Am, true, k Am»,fc 



A'L 



A'L 



< 0.01. 



(13) 



(14) 
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Note that the number of bins for the light does not have 
to equal the number of bins for the kinematics, in this work 
we choose 250 bins for fitting the light distribution. 



2.3.2 Kinematics 

To derive the line-of-sight velocity dispersion profile we cal- 
culate the second and fourth moment estimators of the line- 
of-sight velocity distribution fi2,k and p,4,k in bins containing 
at least 250 stars. Assuming that the measurement errors 
are normally distributed, and all measurements and errors 
are independent and uncorrelated, we can obtain fi2 of the 
population as follows. The expectation value of the second 
moment is 



where 



E[m 2 ] = E 



1 



N 



— jJ.2 + S2, 



(15) 



where e; is the unknown noise of measurement i, which we 
assume is drawn from a normal distribution with dispersion 
ffi (i.e. this is the formal error of measurement i). Hence 

S2 = (<*?) — E ^ ST e ?] * s t ne average of the estimated 
squared errors. Here fi2 the true value of the second moment. 
Therefore, our best estimate for the second moment of the 
underlying population is: 



fJ-2 



1 N 



S2- 



(16) 



Similarly, the expectation value of the fourth moment 



£[m 4 ] = E 



1 



= L44 + 3Sa + 6^2S2, (17) 



where we have used that the fourth moment of a normal 
distribution is 3a 4 . Therefore our estimate for the fourth 
moment is: 



/'4 



(18) 



where we have assumed [12 ~ P-2- 

The variance of the second moment var(m2), can be 
determined using var(a;) = E[x 2 ] — (E[x]) 2 , which yields 



var(rri2) 



N 



(fM - (J-2 + 2S 2 + 4/X2S2) 



(19) 



Although we formally need var(/i2), we have found by test- 
ing with a Gaussian distribution, that for our purposes 
var(/t2) ~ var(m2). 

For the variance of the fourth moment we find: 

var(m 4 ) = ^8 + 1054 + 204/i 4 s| + 420/x 2 s| + (20) 
28/i 6 s 2 - 9s$ (21) 

which require the 6 th and 8 th moments: 

E[me] = /i6 + 15^4 S2 + 45^2 S2 + 15^2, (22) 
E[m S ] = Ms + 210/i4sl + 28/i 6 s 2 + 420/i2s|, (23) 

and again we use var(/t 4 ) ss var(?n 4 ). 

The likelihood of the kinematic data given a model is: 



^(kinematic data| model) tx e 



>\ki, 



(24) 



^bins ,„ ,2 N bine 

2 _ Ufg, k - H2,k) \ -> 

Xkin 4- var(A 2 , fc ) + ^ 



(/t 4 ,fc — m.kY 

var(/i 4ife ) 



(25) 



Here fi2,k is given by Eq. Q, (i2,k is the estimate from the 
data for bin k and similarly for the fourth momenlQ. 



2.3.3 Finding a solution 

We need to find the Cij that maximise the probability (Eq. 
I24[) or minimise the Xkim under the condition that all qj are 
positive (and sum up to unity) and the light distribution is 
reproduced to within 1 per cent. This problem can easily be 
solved by quadratic programming (QP), since the minimi- 
sation is quadratic in the df coefficients, and the constraints 
are linear. Note however that for this non-parametric prob- 
lem, the parameter space is very large, and a solution will 
often yield an unrealistically spiky df. To effectively reduce 
the parameter space and yield a s moother df, we add a regu- 
larisation constraint, in ana logy to lCretton et al.l (1 19991 ) and 
Ivan den Bosch et al.l (|2008T ). by including a penalty term to 
the total x 2 - This term has the form: 



2 _ 2 , 2 

Xreg Xreg,E ' Xreg.Li 



(26a 



2 

Xrcg,E 



2 

Xrcg.L 



N L N E -1 
j=0 »=1 



i+1,3 



(26b) 



-1 JV R 



j=l i=0 



where X^g.E and \ 2 



reg,L 



(26c) 

are small for a smooth df. This 
smoothness requirement is implemented by demanding the 
second order derivatives of the df to be small, which we com- 
pute by taking second order finite differences (Eqs. [26b]|26cp . 

In our case we found Al = Ab/8 to work well, and we 
calibrate A_b in the next section. The £i terms are the inverse 
of the (normalised) masses inside the radii defined by our 
energy grid ( M2.2[l . (see also Ivan den Bosch et al]|2008l . Eq. 
29). Since the regularisation term Xre S is quadratic in the df 
coefficients, it can also be optimised using the QP. 
The total x 2 now becomes: 

X = Xreg + Xkin, (27) 

Minimising this equation, in combination with the linear 
constrains of the dj and the linear constraints on the light 
distribution (Eq. I14[) defines the problem for the QP. 



3 TESTING THE METHOD 

3.1 Plummer profile embedded in an NFW dark 
matter halo 

3.1.1 Mock Sculptor 

We now create a mock galaxy that may be representative of 
Sculptor according to previously published dynamical mod- 



1 Here we have neglected correlations between the moments, al- 
though these may exist in practice. 



6 M. A. Breddels et al. 



I I I 1 I I I 1 , I I 1 I , I 1 I 
" f^TTn — r-i- -■- 


11,111 






1 Ql , 1 1 1 , , , 1 1 , 1 1 1_1 1 1 

1,u 0.25 0.5 0.75 1.0 1.25 1.5 
R (kpc) 

20.0i 



15.0- 




"0 0.25 0.5 0.75 1.0 1.25 1.5 
R (kpc) 

Figure 1. The linc-of-sight velocity dispersion (bottom) and kur- 
tosis (top) for mock Sculptor. Black symbols:: Values for the 
moments in radial bins from the mock Sculptor data, with la 
error bars. Blue contours: Recovered profiles from the models, 
where the regions correspond to the 68.3, 95.4 and 99.7 per cent 
confidence intervals. 
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Figure 2. Result of a test of our Schwarzschild code on mock 
Sculptor. Here we have assumed knowledge of the df coefficients 
and recovered the intrinsic properties of the model. 

els of this system (|Battaglia et al.ll2008l ). The goal is to test 
our method in the region of parameter space where we ex- 
pect Sculptor to be. For the stellar component we choose 
a Plummer profile with total mass M* = 10 6 M© and a 
scale radius b = 0.3 kpc. The stellar component is em- 
bedded in a spherical NFW dark matter halo with scale 
r s — 0.5 kpc, and enclosed mass at 1 kpc of Mdm(< 1 kpc) = 
1O 8 M . The radial density profile for the NFW halo is of 
the form pdm(t) = po(r/r s )~ (1 + r/r B )~ 2 . We set the 
velocity anisotropy to be constant, f) = —1. Recall that 
/3(r) — 1 — crt(r)/(jr(r) and erf (r) (where erf = cr^ = a% for 
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Figure 3. Result of the application of the Schwarzschild code on 
our mock Sculptor. The figure shows that the intrinsic structure 
is recovered through the QP when the underlying gravitational 
potential is known. 



every r) and <7r{r) are the second moments of the intrinsic 
velocity distribution at radius r in the tangential and radial 
directions respectively. By assuming the df to be separable, 
i.e. f(E,L) = fE(E)fL{L), we may compute it explicitly 
(numerically) as described in Appendix El 

As an extra check that our model galaxy is physical 
and stable, we have generated phase-space coordinates for 
100 000 stars from its df, an d simulated it numerically using 
GADGET-2 (|Springe]||2005l ). In this simulation the stars are 
represented as N-bodies and they are embedded in the static 
potential given by the dark halo of our mock Sculptor model. 
We found that, even after 10 Gyr of evolution, the density 
distribution, velocity dispersion profiles and the anisotropy 
match the initial values well. 

To generate observations of our mock Sculptor we could 
draw a random sample of ~ 2000 stars from its distribution 
function. However, this has the disadvantage that many re- 
alisations would be required to test if the mean of the recov- 
ered quantities matches the known input values. Therefore, 
for the purposes of testing our modelling technique we prefer 
to compute the moments of the line-of-sight velocity distri- 
bution at different radii directly from the known distribution 
function, as this is less susceptible to randomness. We add 
uncertainties in the moments and choose the location of the 
radial bins to match the Sculptor data set. Fig. [T] shows 
the line of sight velocity dispersion profile and the kurtosis 
derived in this way. Note however in the model fitting we 
use the second and fourth moments since these are linear 
in the df coefficients. We calculate the uncertainties in the 
moments using Eqs. [T5] and 1211 assuming no measurement 
errors since these contribute only ~ 1% of the error budget 
for the typical measurement errors of 2 kms -1 and line of 
sight velocity dispersions of 10 kms -1 found in dSph. There- 
fore the uncertainties in the moments are only due to the 
number of objects per bin. Here we we choose to have 250 
stars per bin, which gives a total 8 bins for a sample of 2000 
objects. 

We proceed to test our code in two steps. In the first in- 
stance our aim is to establish how well the method recovers 



Orbit-based dynamical models of the Sculptor dSph galaxy 7 



the intrinsic properties of our mock galaxy if the df is known. 
Thus in this first test we use the known df to compute the 
df coefficients. These define the orbital weights which our 
Schwarzschild code uses to calculate the observables. The df 
coefficients are shown in the upper left panel of Fig. [2] The 
recovered (normalised) mass per intrinsic (3d) bin (Eg. If 2 [I . 
is plotted in the top right panel of the same figure. The red 
dashed curve shows the output of the Schwarzschild code, 
while solid black corresponds to the true values. In the lower 
left panel we plot the velocity dispersions for the radial (red) 
and tangential (green) directions. The solid curves indicate 
the true values, whereas in dashed we showed the recovered 
dispersions. Here the "true" velocity dispersion has been 
calcu lated using the Jeans equations (jBinnev fc Tremaind 
l2008i . chapter 4). The lower right panel shows the true (solid 
black) and the recovered (dashed red) anisotropy as a func- 
tion of radius. This exercise shows that given the correct 
weights we are indeed able to recover the known intrinsic 
properties of our mock galaxy. 

The small deviations from the true values especially vis- 
ible in the anisotropy profile are expected since the df coef- 
ficients only approximate the true df. These deviations can 
thus be removed by increasing the number of df coefficients. 
For example, if we double the number of coefficients in the 
energy and angular momentum directions, the small offset 
between the true and recovered anisotropy profiles disap- 
pears. The increase in the resolution in the energy direction 
also leads to the elimination of the wiggles in the anisotropy 
profile. On the other hand, the turnover of the anisotropy 
profile seen at small radii is related to the sampling of or- 
bits with the highest binding energy. Recall that we sam- 
ple orbits from a minimum radius r m i n ~ 0.03 kpc, so that 
the highest binding energy radial orbit has its apocentre 
at r m i n . The orbits that contribute to the region r < r m i n 
are those which are very elongated with pericentres inside 
this radius and with large apocentres (beyond r m i n ), and 
the set of orbits with the highest binding energy but which 
have more angular momentum. These more circular orbits 
only contribute within a small range in radii, and hence the 
resulting velocity ellipsoid is radially biased. Clearly if we 
were to reduce r m i n , i.e. increase the sampling of orbits in 
the central regions, this will lead to a decrease in the ra- 
dius at which the velocity anisotropy turns over. However, 
we deem this unnecessary as the amount of mass associated 
to this region is negligible, and this regime is in fact out- 
side the reach of observations since we only have access to 
observables along the line-of-sight, and a star at small pro- 
jected radius could be located at larger physical radii from 
the centre. Furthermore, the size of the currently available 
data sets is a strongly limiting factor (see next paragraph). 

We now use the full Schwarzschild method, and solve 
for the df using QP. For the regularisation parameters we 
found \e =0.1 to give good results. Fig. [3] summarises our 
findings. The overall properties of the df are well recovered 
as well as the remaining characteristics (see Fig. [2] for com- 
parison). The anisotropy is recovered accurately except for 
r < 0.05 kpc. This is not due to sampling of highly-bound 
orbits discussed above, but is mostly driven by the small 
number of stars in this (3d) inner region. Running the same 
experiment with a larger data set (10 000 and 50 000 stars) 
we see the mismatch in the anisotropy to occur at smaller 




-0.5 0.5 
log r 5 /kpc 



0.25 0.5 0.75 1.0 1.25 1.! 
r (kpc) 



Figure 4. Left column: Probability density functions (joint and 
marginalised) for mass and scale parameters of the NFW dark 
matter halo potential recovered for mock Sculptor model. Blue 
dot and blue lines (left column) indicate the maximum likelihood 
value (of the unmarginalised pdf), while the red dot and verti- 
cal dashed lines indicate the input values for the mock Sculptor 
model. The green solid line indicates the median value and the 
blue regions (or black contour lines in the top left panel) the 68.3, 
95.4 and 99.7 per cent confidence intervals. Top right: Recovered 
anisotropy profile. Middle right: Recovered logarithmic density 
slope (see text) for the dark matter. Bottom right: Recovered 
enclosed mass profile. 



radii. In practise, this means that with the current data sets 
we are not sensitive to the anisotropy at r < 0.05 kpc. 



3.1.2 Global halo parameter recovery 

In the above tests we showed that the Schwarzschild code 
accurately recovers the df and therefore the kinematic prop- 
erties of our mock dwarf galaxy. This test was done assum- 
ing that the (enclosed) mass within 1 kpc of the NFW halo 
(Mikpc) and its scale (r s ) were known. We now focus on how 
to estimate these parameters directly. 

We proceed to calculate the probability of a model for 
a set of parameters values. In our case these parameters 
are Mik pc and r s . However, instead of calcu lating this prob- 
ability on a regular grid as done in e.g. iGebhardt et al.l 
(|2007t) and Ivan den Bosch et al.l (I2008TI. we use an adaptive 



method, similar to IGebhardt fc Thomas! (|2009| j. This first 



finds the probability density function (pdf) on a coarse grid 
and then determines where the pdf needs to be refined, and 
does so hierarchically. This allows us to obtain a relatively 
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smooth pdf via the evaluation of a small number of models. 
For each set of model parameters we calculate the (relative) 
probability as p oc e~2 Xkin fEq. l25p . This results in estimates 
of the best fit parameters, as well as in confidence intervals. 
We assume the prior on Mik pc to be uniform in log Mik pc in 
the range log Mik pc G [7.6, 8.20 and the prior on r s uniform 
in logr s in the range logr s 6 [—1,1]. 

The pdf for the parameters Mik pc and r s for our mock 
Sculptor model is shown in the top left panel of Fig. [4] The 
pdf is nicely centred on the input values Mik pc = 10 8 Mq and 
r s = 0.5 « KT - 3 kpc. The maximum likelihood value (blue 
dot or lines) almost equals the input value, where the small 
deviation is caused by the discretisation of the pdf. Although 
the enclosed mass at 1 kpc is recovered both accurately and 
precisely (mean Mi kpc = 1.02 x 10 8 00±0 03 M Q , correspond- 
ing to a 7% uncertainty, or Mi kpc = 1.02tE5;S™ x 10 8 M o ), 
the scale radius is more poorly constrained (mean r B = 
0.56 x 10 14 kpc, corresponding to a 37% uncertainty, or 
r s = 0.56^o'i5 kpc). Note that the marginalised pdf for 
Mik pc and r s are somewhat asymmetric (a reflection of what 
is seen in the upper left panel of Fig. [4j, and this leads to 
slightly biased mean values for the parameters of the model. 

Each Schwarzschild model (i.e. for a given Mik pc and r s ) 
results in a single anisotropy profile. To find the pdf of the 
velocity anisotropy profile one shoul d integrate (marg inalise) 
over all possible df coefficients (as in lMagorrianll2006T ) . How- 
ever this is not always feasible due to the high dimen- 
sionality of the parameter space required to specify the df 
(Ne x Ni = 160 for this model). Instead we take the single 
anisotropy profile of each model, and calculate the proba- 
bility density function for the anisotropy as a function of 
radius as follows: 

p(P\r) = J dM lkpc J dr s p(/3|r,M lkpc ,r s )p(M lkpc ,r s ). 

(28) 

We plot the median anisotropy as a function of radius in 
green in the top right panel of Fig. [4j together with the 
68.3, 95.4 and 99.7 percent confidence intervals in blue. Note 
however, that the anisotropy values at different radii are not 
independent. The input anisotropy is indicated by the red 
dashed line. The anisotropy seems to be reproduced quite 
accurately, except at small radii. Since our technique re- 
covers nearly perfectly the input values of the model, the 
anisotropy profile found is essentially equivalent to that de- 
rived in Fig. [3] The mismatch at small radii is explained 
in the previous section, and the apparent small uncertainty 
in the anisotropy in this region is likely due to not having 
marginalised over all orbit weights. 

In the central right panel of Fig. 2] we plot the logarith- 
mic slope (rj) of the dark matter density (/5dm) as a function 
of radius: 

= dJo^DM 

dlogr 

In the inner parts r] = — 1 and in the outer regions rj — —3 
due to our choice of the NFW profile. In the next section 
we also explore changing the inner slope, which makes this 
plot more meaningful and useful for later comparison. 

In the bottom right panel we plot the enclosed (dark 

2 Outside this interval the pdf is essentially zero. 



matter) mass as a function of radius. The least uncertainty 
in the enclosed mass is at r as 0.5 — 0.6 kpc. This radius is 
close to the half light rad ius r^ ii ~ 1.3b ~ 0.4 k pc where 
IWalker et all (|2009l . |201Ch and IWolf et all (|2010h find the 
enclosed mass to be most robustly determined and to be 
independent of anisotropy. 

The line of sight velocity dispersion and the kurtosis 
profiles obtained from the models are shown as the blue 
contours in Fig. [1] These have been computed in an anal- 
ogous manner to the anisotropy profile, i.e. as in Eq. Q28p . 
This figure shows that the resulting curves are in excellent 
agreement with the input profiles. 

To gain further confidence in our methodology, we have 
also performed a similar set of tests for different anisotropy 
profiles, while keeping the same stellar and dark matter den- 
sity profiles. In one case the anisotropy varied from /3 = — 1 
in the centre to /3 — +0.25 at larger radii (i.e. from tangen- 
tially to radially biased). The other case we have tested has 
an anisotropy profile that changes from j3 = at the cen- 
tre to ft = — 1 at larger radii (i.e. from radial to tangential 
anisotropy). Also in these cases all the quantities recovered 
are in excellent agreement with the input values, indicating 
that our methodology works well and is robust. 



3.2 Changing the dark matter halo density profile 

In reality we will not know the actual density profile of the 
dark matter halo hosting a galaxy like Sculptor, and we 
would like to determine this from the data. A particularly 
interesting quantity is the inner slope of the density profile 
since this depends on the nature of the dark matter particles 
themselves, i.e. whether it is cold, warm or self-inte racting 
|Avila-Reese et all l200ll : ISpergel fc Steinhardt|[200(il ) . 

Therefore, in this section we use our mock Sculptor, 
which is embedded in an NFW profile, but we assume a 
more general functional form to test the performance of our 
Schwarzschild method, i.e. we take: 

PdmM = po (r/r s ) a (1 + r/r s )" (3+a) , (30) 

such that for a = — 1 this reduces to the NFW case. For the 
orbit integration we need to know the potential (or rather 
the forces) generated by this density distribution. Since no 
general analytic expression exists for these general poten- 
tials, we have to solve Poisson's equation numerically. We 
do t his using the FEM (Fini te Element Method) method 
(e.g. iPepper fc Heinricb|[l992l ). Our basis functions are La- 
grange polynomials of degree to 3 (cubic), which leads to a 
force field of order 2 (quadratic). We use a grid of 200 points 
in log radius, from r = 10~ 6 — 10 4 kpc. Testing this in the 
case of the NFW profile we find that the relative errors in 
the force in this range are ~ 10 -6 . 

We use our Schwarzschild code to find the best model 
that fits our mock Sculptor data, now with an additional un- 
known parameter a, assuming a uniform prior in the range 
G [—2,0] (a > corresponds to a central hole in the dark 
matter distribution, which we do not consider). The results 
are given in Fig. [5] The top row in shows the joint pdfs, 
marginalised over the remaining parameter. The middle row 
shows the pdfs of the single parameters, marginalised over 
the other two parameters. The blue dots and blue lines indi- 
cate the maximum likelihood value (of the unmarginalised 
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Figure 5. Top two rows: Probability density functions (joint and marginalised) for mass, scale and inner slope parameters of the dark 
matter halo potential recovered for our mock Sculptor model. Blue dots (top row) and blue lines (middle row) indicate the maximum 
likelihood value (of the unmarginaliscd pdf), while the red dot and vertical dashed lines indicate the input values for the mock Sculptor 
model. The green solid line the median value and the blue regions (or black contour lines in the top row) the 68.3, 95.4 and 99.7 per cent 
confidence intervals. Bottom left: Recovered anisotropy profile. Bottom centre: Recovered enclosed mass profile. Bottom right: 
Recovered logarithmic density slope (see text) for the dark matter. 



pdf). In the bottom row the recovered anisotropy, mass and 
density profile are shown. 

In general, all quantities are recovered quite well. How- 
ever, the pdf of a versus log r s shows an important degener- 
acy between these parameters, indicating that it is hard to 
determine either of these quantities reliably from our mock 
data set. The maximum likelihood (the blue dot) is slightly 
offset from the input value (red dot), which may indicate 
small systematic errors due to for instance the discretisation 
of the distribution function. However, note that since this 
systematic offset is in the direction of the degeneracy, the 
systematic error is small compared to the statistical uncer- 
tainty and therefore we do no consider this to be a problem 
for data sets of this size and quality. This analysis suggests 
that the current data is not sufficient to provide a good es- 
timate of the inner slope for these models. The limitation 
lies in the number of stars with spectroscopic measurements 
(which in the case tested here is 2000) and/or their spatial 
distribution. 



4 APPLICATION TO THE SCULPTOR DSPH 
GALAXY 

4.1 Data and extracted velocity moments 

We use the line-o f -sight velocities of iBattaglia et all (|2008l ) 
and IWalker et all (|2009l ) to create a velocity dispersion pro- 
file. To this end we first need to convert the measurements of 
the heliocentric line-of-sight velocities into line-of-sight ve- 
locities that take into account the space motion of Sculptor. 
We provide below a brief summary of this procedure and 
refer the reader to Appendix [B] for more details. 

The heliocentric line-of-sight velocities of Sculptor's 
stars are shown in Fig. [6] As can be seen from this figure, 
there appears to be a velocity gradient al ong the major axis 
(see also Fig. 1 of IBattaglia et al.l l2008h . The presence of 
such a gradient coul d be due to intri n sic ro tation in Sculp- 
tor, as suggested by IBattaglia et ail ([2008) . On the other 
hand it is also possible that the gradient is a result of the 
projection of the proper motion of the centre of mass of 
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Figure 6. Top: Line of sight velocities for Sculptor versus radius. 
We only use the stars at radii r < 3400 arcsec. The grey line indi- 
cates the systemic radial velocity for Sculptor and red lines ±3cr 
the mean velocity dispersion. Bottom: Heliocentric line-of-sigh t 
velo cities from th e com bined data set of [Battaglia et al.l j2008h 
and IWalker et al. The velocities have been smoothed by 

taking the median in cells of 0.2 degrees on a side. 



4-1.1 Velocity dispersion profile 

For our dynamic modelling, we need to calculate the ve- 
locity dispersion profile of Sculptor in radial bins. To this 
end, we initially make a rough selection of the likely mem- 
bers of Sculptor, and then perform a more thorough analy- 
sis including the effects of Milky Way contaminants. In the 
first step, we take the systemic heliocentric radial velocity 
(fSci,sys,hciio = 110.6 km s _1 ) an d the mean velocity di sper- 
sion (o-sd = 10.1 km s _1 ) from iBattaglia et al.l (|200Sl ). We 
require that the member stars are within 3a of the systemic 
velocity of Sculptor, as indicated by the red solid lines in the 
right panel of Fig. [5] Furthermore we also require that they 
are located within r < 3400 arcsec (~ 0.94 deg, 1.3 kpc), 
indicated by the green dashed line in the same panel. We 
add this requirement since we are not confident that outside 
this radius a reliable velocity dispersion can be measured 
due to the low number density of (probable) Sculptor mem- 
bers compared to Milky Way stars. An improved method 
for discriminating Milky Way c ontaminants ba s ed on sur- 
face gravity in the data set of Battaglia et al. (200c|) has 
been developed by (IBattaglia fc Starkenburdl201lT ). see also 
IWalker et all (|2009l ). 

We then define radial bins such that each bin contains 
at least 250 stars that match these criteria. From the total 
of 2306 stars, 1695 match the above two criteria, resulting 
in 6 radial bins, where the last one contains 445 stars. 

After we defined our bins to include at least 250 prob- 
able members, we remove the requirement of being within 
3cr of the systemic velocity of Sculptor. We now only require 
r < 3400 arcsec (= 1.3 kpc), so all stars below the green 
dashed line in top panel of Fig. [6] are considered for calcu- 
lating the velocity dispersions (2153 stars). We now use a 
model for the velocity distribution of the foreground con- 
tamination and of Sculptor itself, which then allows us to 
calculate the most likely velocity dispersion in each radial 
bin. 

Following IBattaglia et al.l (|2008l ) and IWalker et all 
(|2009l ) we model the velocity distribution in a radial bin 
as a sum of Gaussians. The velocity distribution of Sculp- 
tor itself is modelled as a single Gaussian, while that of the 
Milky Way is m o delled as a sum of two Gaussian^, following 
IBattaglia et al.l (|2008l ). Then the probability of the velocity 
dispersion of Sculptor in radial bin j with data Dj is: 



p(a j \D j ) = p( - D ^ a ^ p( - a ^ = p( D 3,i\ a j)p( a j) 



p(Dj 



n 



(31) 



Sculptor (or a mix of bo th), in which case it can be used 
to infer its space velocity l| Walker et al.|[2008l ). In absence of 
independent and direct measurements of the proper motion 
of Sculptor, it remains debatable what the source of the gra- 
dient is. For simplicity, here we assume that Sculptor does 
not rotate and we derive the velocity of the centre of mass 
of Sculptor from the line-of-sight measurements in Appendix 
IB1 We note that in practice, our procedure simply removes 
the gradient, which one might say is equivalent to having 
removed (solid body) rotation. 



i 

+p(Rj,i,Vj,i,->m\<T j )) , 

where Rjj and Vj :i are the radius and velocity of the i th 
star in the f bin, Nj is the number of stars in bin j, 
p(o~j) is the prior, which we take flat between the range 

3 This gives a good fit to the Besangon model in this region of 
the sk y and for stars with colours and magnitudes in the observed 



range llRobin et al.ll2003h . 
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Figure 7. Line-of-sight velocity dispersion pro file o btained in ra- 
dial b ins for the data bv lBattaglia et ah I feoosl) and lWalker et all 
d2009T ) of Sculptor, taking into account the foreground contamina- 
tion by the Milky Way. The dashed curve corresponds to the (pdf 
weighted) median line-of-sight velocity dispersion profile from the 
Schwarzschild models presented in Sec. 4.2, while the contours in- 
dicate the 1, 2, and 3<r uncertainties around this curve. The last 
bin extends to 1.3 kpc. 



For each radial bin we find the maximum likelihood 
value for the velocity dispersion. After this, we perform a 
3cr clipping around the mean, and estimate the second and 
fourth moments for the remaining stars using Eqs. (|16|) and 
(fl"8)l . The errors are computed from Eqs. ([19]) and l[2lj). Fig. [7] 
shows the resulting velocity dispersion profile and the kur- 
tosis (/t4//i|). The final sample contains 1696 member stars. 



4.2 Schwarzschild method applied to Sculptor 

We now apply the Schwarzschild method to the data from 
the Sculptor dSph and model this galaxy as a (non-rotating) 
spherically symmetric system. For the light distribution we 
assume a Plummer profile with scale radius b = 0.3 kpc 
(Battaglia 2007). 

We first assume that Sculptor is embedded in an NFW 
dark matter halo, as we did for mock Sculptor in H3.ll 
The results of this modelling are shown in Fig. [8] We ob- 
tain a tight constraint on the enclosed dark matter mass of 
M lkpc = 1.03 x 1O 8 ' OO±O O3 M (7% uncertainty, or M lkpc = 
L03 ±o:o70 x 1O S M ). The scale radius at r 3 = 2.15 x 10 ±025 
kpc (76% uncertainty, or r s = 2.15iJ'g3 kpc) is less well 
constrained, similar to what we find for mock Sculptor. In 
comparison to our mock model, the Sculptor dwarf galaxy 
would seem to have a larger scale radius (see Fig. [4}. 

Our estimates are consistent with those derived in pre- 
vious work for the NFW fa mily of mass models. For example, 
IWalker et ail (|2009l. 120101 ) derive a mass of lut" x 10 7 M o 
within 1.1 kpc, while we estimate lOli'j x 10 7 M o within the 
same distance with sm aller error bars. On the other hand, 



within 1.8 kpc, while our measurement at this radius is 



< (Jj ^ 30 km s _1 and m and indicate the Boolean iBattaglia et al.l (|2008l ) obtained a mass of 2.2±o.? x 10 8 M o 
value of being a member star of Sculptor or not. The pro- 
portionality can be used since the denominator is a normal- 
isation constant. The first terms in the last line of Eq. I)3ip 
can be expanded further (for each j): 



1.9±o 3 x 10 8 Mp). The m ass estimates by IStrigari et al 



2008) IWalker et all (|201p| . MCMC value) and IWolf et a! 



20101 ) are over plotted in the bottom right panel of Fig. [H 



p(Ri, Vi, m\a) = p(Ri,Vi\m, a)p(m\a) 
= p(Ri,Vi\m,a)p(m) 
= p(Ri\m)p(vi\m, a)p(m) 



(32) 



We take the prior on membership, to be equal p(m) = 



p(-^m) 



Using the model of Sculptor as described above, 



p(Ri\m) = ^sd(Ri), the normalised surface density and 
p(vi\m, a) is a Gaussian convolved with the individual mea- 
surement errors on Vi. In the case of duplicates (these are 
stars in common in the data sets) we average the line-of-sight 
velocities and the errors (in quadrature). Two observations 
are considered to be from the same star when the astrome- 
try agrees within 1 arcsec, and a velocity difference less than 
3e, where e is the average velocity error. 

The second term in Eq. (|31|) can similarly be derived 
by replacing m with -^m in Eq. (|32l) . p(Ri\^rn) = fiMW is 
the density of the Milky Way foreground. Since the normal- 
isation is not important, we only need to know the ratio 
Mmw / MSci(-Ri) in each bin. This can be estimated by the 
ratio of stars outs ide the 3<t and inside the 3<r velocity dis- 
persion, similar to IBattaglia et al] (|2008l ). Note that if there 
is any bias in the sampling of the kinematic data (which 
usually is the case), it will affect both the Sculptor data and 
the foreground data in equal ways, and will cancel out in 
the ratio. The term p(vi\ ^m, a) is the weighted sum of two 
Gaussians as described in IBattaglia et alj (|2008l ) . 



and all three agree very well with ours and are within the 
confidence regions. 

The top right panel of Fig. [S] shows that Sculptor's 
anisotropy is mostly tangential and fairly constant with ra- 
dius, except near the centre where it becomes slightly more 
isotropic (even after talking into account our limitations due 
to the projection effects shown and discussed in the context 
of Fig. [3]). This anisotropy profile at r > 0.1 kpc is consis- 
tent with the con stant anisotropy ass umed in Jeans models 
of Sculptor, as by I Walker et al. I (|2007l ). who find /3 = -0.5. 

We plot the joint pdf of Mik pc and r s again in Fig. [9] 
In the left panel we plot lines of constant virial mass 
M200 in bluqj, with the blue dotted line indicating a value 
of logM2oo = 8.5, increasing with steps of 0.5 dex until 
logM2oo = 10.5. Orange lines indicate constant concentra- 
tion values, with the orange dashed line corresponding to 
c = 10, increasing with steps of 5 until c = 40. This shows 
that the concentration of Sculptor is ~ 15 ± 6 and that the 
virial mass is not well determined (not better than within 
factor of 100 at a 3<r level uncertainty). 

Cosmological N-body simulations of dark matter have 



4 M2C10 is the virial mass (mass enclosed within r2oo)> where r2oo 
is the distance at which the average density of a dark matter halo 
is 200 times the cosmological density p c (e.g. iBinnev fe Tremainel 
|200S| . §2.2). 
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Figure 8. Left column: Probability density functions (joint and 
marginalised) for mass and scale parameters of the NFW dark 
matter halo potential recovered for Sculptor. Blue dot and blue 
lines (left column) indicate the maximum likelihood value (of the 
unmarginalised pdf). The green solid line indicates the median 
value and the blue regions (or black contour lines in the top left 
panel) the 68.3, 95.4 and 99.7 per cent confidence intervals. Top 
right: Recovered anisotropy profile. Middle right: Recovered 
logarithmic density slope (see text) for the dark matter. Bottom 
right: Recovered enclosed mass profile. 



shown that there is a relation between the concentration 
of dark matter halos and their virial masses, the s o called 
mass-concentration relation (e.g. lBullock et al1l200ll ). In the 
right panel of the Fig. [9j we show as the dashed black line 
the mass-concentration relation of lMaccio et al.l (|2007l ): 



logcaoo = -0.109 log(M 20 o/M Q ) + 2.34. 



(33) 



Judging solely from this relationship this would suggest that 
Sculptor is not compatible with the current ACDM cosmol- 
ogy. If we however plot the intrinsic scatter of ai n C200 = 0.33 
in the same panel (solid red lines, 1,2 and 3a contours) we 
see that Sculptor lies well within the 1 and 2a contours. We 
can also use the mass-concentration relation as a prior in 
our models. The results are shown as the green contours in 
this figure and they are slightly smaller than the original 
contours. The effect is small, but leads to a narrowing down 
of the possible values for r s . 
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Figure 9. Left: The black contours correspond to the same pdf 
as that shown in the bottom right panel of Fig. \E\ Blue lines 
indicate curves of constant M20O1 with the blue dotted line cor- 
responding to a value of logM2oo = 8.5, increasing with steps 
of 0.5 dex until logM2oo = 10.5. Orange lines indicate values of 
constant concentration, with the orange dashed line correspond- 
ing to c = 10, increasing with steps of 5 until c = 40. Right: Red 
contour lines indicate the cosmologically motivated prior, with 
the black dashed line the mean value. The green contours are the 
pdf obtained using this prior for Sculptor. 



4.3 Dark matter inner density profile 

We now consider a more general dark matter profile for the 
dark matter halo of Sculptor as we did for our mock models 
in c]3.2l by allowing in the inner slope a to vary (see Eg. 1301) . 
The results are shown in Fig. 1101 

Fig. [10] shows that the maximum likelihood value for 
Mikpc and that the velocity anisotropy recovered by the 
Schwarzschild method are in very good agreement with the 
values obtained when a is fixed to —1 to vary as in Fig. [S] 
However, as discussed in Sec. I3.2l the strong degeneracy be- 
tween r s and a implies that the scale radius is less well 
determined. 

The middle right panel of Fig. 1 101 shows that the distri- 
bution of values for the inner slope a is very broad. Nonethe- 
less it is clear that very steep cuspy profiles (a < —1.5) are 
excluded. The maximum likelihood value is reached for a 
cored profile (a = 0), although this is statistically indistin- 
guishable from slightly cuspier slopes as evidenced by the 
pdfs in this Figure. The bottom right panel of Fig. 1 101 shows 
that at a distance of 250 pc (where the anisotropy profile 
begins to change its shape, and which according to our tests 
in Sec. 13.21 is the inner most point where it is reliably de- 
termined) the median logarithmic slope profile (green line) 
takes a value of ~ —1.25, which is larger that found in our 
mock Sculptor model (~ —1.75). Since the maximum likeli- 
hood value of r s estimated by the Schwarzschild method is 
not very different from that assumed in mock Sculptor, this 
comparison would suggest that the density profile of Sculp- 
tor is shallower than NFW, although the uncertainties are 
still too large to make a very firm statement. 

Our results agre e with p revious studies of Sculp- 



tor jBattaglia et all |200S| ; lAmorisco fc Evans! l201ll ; 
IWalker fc Penarrubial l201ll ) that a core is more likely, 
however we do not find the evidence strong enough to 
rule o ut a cusp. The approach used by lAmorisco k, Evans] 
(2011) is to assume that the phase-space distribution 
function follows the Michie-King form, and to fit simulta- 
neously the metal-rich and metal-poor populations found 
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in Sculptor. We believe that it is the first assumption that 
might lead to the conflicting results, since Michie-King 
models only allow radial anisotropics, while we find, with 
high confidence levels, that the orbits of stars in Sculptor 
are tangentially biased, especially at radii beyond 250 pc, 
where the dominant population is the metal-poor one. Their 
results thus seem to be driven by the fact that Sculptor 
has a relatively flat velocity dispersion profile, and that in 
order to reproduce this under the assumption of radially 
anisotropic orbits forces the density p rofile to change slowly 
with radius, i.e. to be cored. Also IWalker fc Penarrubial 
(|201ll ) find strong evidence for a core. It is more difficult 
to make a comparison to this work because of the very 
different method used to estimate the inner slope, although 
it should be recalled that these authors use the existence of 
at least two distinct populations, and this may or may not 
be the cause of the difference. 



5 DISCUSSION AND CONCLUSIONS 

We have presented a spherically symmetric dynamical 
model for the Sculptor dwarf spheroidal galaxy using the 
Schwarzschild orbit superposition method. This method fits 
a set of observables, which in our case are the light, the 
second and fourth moments of the line of sight velocity dis- 
tribution. We have tested this method on a mock model for 
the Sculptor dSph galaxy embedded in an NFW profile, and 
generated with similar sampling and velocity errors as the 
data currently available for this system. 

In our tests we have found our method to give precise 
(7% uncertainty) and accurate estimates for the mass within 
I kpc, when assuming that the underlying gravitational po- 
tential is of NFW form. However the scale radius is recov- 
ered less precisely (37% uncertainty) for data sets containing 
~ 2000 member stars. We have also explored a more gen- 
eral model for the dark matter halo and found that we are 
able to measure the logarithmic slope of its density profile, 
although the central value is weakly constrained. Nonethe- 
less we find that the maximum likelihood value for the inner 
slope is very close to the input value. 

We then used the Schwarzschild method on Sculptor 
after having estimated the second and fourth line of sight 
velocity moments for this galaxy. Assuming an NFW profile 
for the dark matter profile, we derive a mass within 1 kpc of 
Mi k p C = (1.03 ±0.07) x 10 s M , and find the concentration 
(c ~ 15) to be compatible with current ACDM predictions, 
given the expected s catter in the mass-concentration relation 
(|Macci6 etal1l2007h . When we try to constrain the inner 
slope of the dark matter density profile of Sculptor, we can 
exclude very cuspy profiles (a < —1.5). However, given the 
current data set, our method does not seem to be able to 
discriminate in a statistically significant way between a cusp 
and a core. We are, however, able to determine that the 
logarithmic slope of the density profile falls off to the value 
of —2 at a distance of ~ I kpc. 

The Schwarzschild method is also able to derive the ve- 
locity anisotropy profile, except near the centre where we are 
limited by the number of tracers. For Sculptor we find this 
to be tangentially biased with a hint that it may become 
more isotropic for r < 250 pc. This result is nearly inde- 
pendent of the assumed shape of the dark matter density 



profile, whether NFW or its generalised form. This nearly 
flat tangentially anisotropic ellipsoid should hold clues to 
the formation and dynamical evolution of Sculptor but it is 
as yet unclear whether a model exists that can reproduce 
this trend. 

Models in which stars follow the dark matter are in- 
consistent with our results, a s they predict a more radially 
anisotropic velocity ellipsoid (|Diemand et al.ll2004 ). On the 
other hand, the tidal stirring of a disky galaxy (see e.g. 
iMaveri l2O10h . can lead to a tangentially biased ellipsoid. 
However, this model predicts that the ellipsoid becomes in- 
creasingly tangential with radius as a consequence also of 
tidal stripping, and this is not we derive at face value. 

Schwarzschild modelling does not have to assume a 
parametric form for the velocity anisotropy as for instance 
in the commonly used Jeans modelling. We therefore believe 
that we are less affected by biases due to assumptions com- 
pared to such class of models. Furthermore, by construction 
we are guaranteed that our models are physical in the sense 
of having non-negative distribution functions. 

We plan to develop the Schwarzschild method further 
to work with the full line of sight velocity distribution, in- 
stead of binning the data and comparing it the the velocity 
moments profile. Avoiding the loss of information when bin- 
ning, we expect that this may give us better estimates for 
the inner slope and the anisotropy profile. Also, since nei- 
ther Sculptor nor any of the other dwarf spheroidal galaxies 
are spherical, we are developing a non spherical orbit-based 
dynamical model. We also plan to apply this modelling to 
other dwarf spheroidal galaxies such as Fornax, Carina and 
Sextans in future work. 
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Figure 10. Top two rows: Probability density functions (joint and marginalised) for mass, scale and inner slope parameters of the 
dark matter halo potential recovered for Sculptor. Blue dots (top row) and blue lines (middle row) indicate the maximum likelihood 
value (of the unmarginalised pdf). The green solid line indicates the median value and the blue regions (or black contour lines in the 
top row) the 68.3, 95.4 and 99.7 per cent confidence intervals. Bottom left: Recovered anisotropy profile. Bottom centre: Recovered 
enclosed mass profile. Bottom right: Recovered logarithmic density slope (see text) for the dark matter. 
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APPENDIX A: NUMERICAL 
APPROXIMATION TO THE DISTRIBUTION 
FUNCTION 

We take the following separable form for the distribution 
function: 

f(E,L) = f E (E)f L (L). (Al) 

For a constant anisotropy for instance, }l{L) oc Z/~ 2/3 . Now 
we assume that the distribution function f(E,L) can be 
approximated by f(E,L), where }e{E) is a sum of delta 
functions, such that: 



1 N 

f(E,L) = -^ Wi 5(E - Ei)f L (L). 



(A2) 



The density distribution corresponding to this distribution 
function is: 

v(r) = 2tt f r ' maX dv r f v t dvtf(E,L) (A3) 
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where is the Heaviside step function and the C*i(r) the 
densities that correspond to the each of the energy delta 
functions. 

Given a stellar density distribution v(r) and a gravita- 
tional potential $(r), it may be possible to find the weights 
Wi such that v{r) ~ v(r). in this case we may state that 
we have found a numerical approximation to the distribu- 
tion function that generates the proper stellar density dis- 
tribution and is embedded in the potential $(r). A solution 
can be found for instance using a non-negative least square 
method. An even simpler method is to start with the 9j 
corresponding to the lowest binding energy. All Vi associ- 
ated with higher binding energies can only contribute to the 
density at smaller radii, therefore by weighing Vj this can ac- 
count for the density out to the outermost radius. Now one 
can proceed with the next zV Thus we start from the low- 
est binding energy components, use appropriate weights and 
build the density distribution from outside in. Care should 
be taken to make sure all weights are positive. 

In the case of the mock Sculptor model discussed in the 
main text, v(r) is the Plummer profile and $(r) is the sum 
of the potentials of the Plummer mass distribution describ- 
ing the stellar component and that generated by the NFW 
profile associated to the dark halo. In this case we have cho- 
sen Jl(L) oc L~ 2/s , where /3 = —0.5. For our purpose we 
choose a logarithmically spaced radial grid of 600 points be- 
tween r m i n = 10 -3 kpc and r max = 10 3 kpc. For each n on 
the grid, we calculate the potential energy, giving us a grid 
of energies, which we take the energies for our distribution 
function (Ei in Eq. lA2l) . For each Ei we calculate the density 
on the same radial grid. The last step is to find the weights 
Wi using the above procedure. A small mismatch (few %) 



of the density at large radii (> 300 kpc) occurs due to the 
distribution function missing lower binding energy compo- 
nents. The cumulative mass distribution of the stellar mass 
deviates < 10~ 4 from the true mass distribution, and within 
300 kpc the relative density deviates < 2 x 10~ 4 %. Outside 
this radius the density does not match very well, but since 
this is at large radii and its mass contribution is very small 
(note also that the cumulative mass distribution shows only 
small deviations) this is of no importance. 



APPENDIX B: CENTRE OF MASS VELOCITY 
OF SCULPTOR 

In this Appendix we transform the observed line-of-sight 
velocities to velocities with respect to the centre of mass 
of Sculptor. This requires knowledge of the latter, which is 
what we derive here using a maximum likelihood method. 

The observed (heliocentric) line-of-sight velocity of a 
star can be expressed as: 

y*.hei{l, b) = ei os (l, b) ■ (v* t s c iQ, b) + v Sc i.gsr — v q> gsr) 
= w*,scz(i, b) + vscI,gsr(1, b) - v Q}G sr(1, b), 

where ei os (l,b) is the line-of-sight unit vector in the direc- 
tion of the star, w*,sci(i, b) the velocity of the star with re- 
spect to the centre of mass of Sculptor, vs c i,gsr the sys- 
temic velocity of the centre of mass of Sculptor with re- 
spect to the Galactic Standard of Rest (hereafter GSR), 
vq,gsr the velocity of the Sun with respect to the GSR 
and • indicates the inner product. The component of the 
line-of-sight velocity we are interested in is v, t sci(i, b). Since 
v*,hei{l,b) is measured, and assuming we know vq^gsr, we 
only need to find vsa,Gsa(l, b)- For the velocity of the Sun 
we use v 0i gsr = v@,lsr + v LS r,gsr = (10.0,5.2,7.2) + 

0,220,0) km s" 1 , where lsr denotes Local Standard of Rest 

Dehnen fc Binnevl Il998h . 

To determine which stars are likely members of Sculp- 
tor we make a rough first selection. We take the systemic 
heliocentric radial velocity (usci,sys,helio = 110.6 km s" 1 ) 
and the mean v e locity dispersion (<ts c i = 10.1 km s _1 ) from 
iBattaglia et al.l (|2008l ). We first require that the member 
stars are within 3a of the systemic velocity of Sculptor, as 
indicated by the red solid lines in the right panel of Fig. 
[5] Furthermore we also require that they are located within 
r < 0.944 degree, indicated by the green dashed line in the 
same panel. We add this requirement since we are not con- 
fident that outside this radius a reliable velocity dispersion 
can be measured due to the low number density of (proba- 
ble) Sculptor members compared to Milky Way stars. 

For simplicity we first assume that the line-of-sight ve- 
locity distribution is described by a Gaussian distribution 
with a constant velocity dispersion and zero mean velocity 
w.r.t the centre of mass of Sculptor. Then the probability 
for Vsd,GSR can be expressed as: 

( v, it s c i(li,b^ 2 



p(v 



Scl,GSR) 
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n 
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■ exp 



exp 
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(Bl) 



eio S (U, h) ■ (vq^gsr — v Sc i,gsr)Y 
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level, and the 2a and 3<r contours for the joint v x and v y 
overlap as well. Perhaps this level of disagreement could be 
taken as an indication that there may be intrinsic rotation 
in the system. Nonetheless, we note that with this proce- 
dure we effectively have removed the observed gradient and 
no apparent rotation remains, whatever its origin. 



Figure Bl. Probability distribution function (pdf) of the three 
velocity components of the systemic velocity of Sculptor with re- 
spect to the Galactic Standard of Rest. Left column: Joint pdfs 
with 1, 2 and 3 u contours lines, marginalised over the other 
component. Right column: Individual pdfs marg inalised over 
the ot her two components. The measure ments fromlPia tek et aD 
l l2006h are shown in red, while those by ISchweitzer et al. I [119951 1 
arc shown in blue. The vertical lines in the right panels and the 
dot in the left panels indicate the maximum likelihood values. 



where of = o% cl + of. is the velocity dispersion of Sculp- 
tor added in quadrature with the measurement error of the 
velocity of star i. Although the velocity dispersion is not 
constant with radius, we use the global value of asci = 
10.1 kms -1 as described previously. 

The joint and marginalised probability distribution 
functions for the velocity components of Sculptor are plotted 
in Fig. lBll together with the 1, 2 and 3 a contours. The maxi- 
mum likelihood value is reached at Vsd,GSR — ("n v y, v z) = 
(278. 5, 101.5, -81.0 ) kms " 1 . These values are in agreement 
with I Walker et all (|2008r >, who use a similar method. We 
also over plot the measur e ments of iPiatek et al.l (|200fj| . in 
red) and ISchweitzer et al] (|l995l . in blue) while the maxi- 
mum likelihood value is indicated in black. Note that the 
uncertainty in v z is smallest since this reflects mainly the 
uncertainty in the mean radial velocity of the centre of mass 
of Sculptor due to its high galactic latitude. The uncertain- 
ties in the other two velocity components mainly reflect the 
uncertainties in the proper motion measurements. Our de- 
termination of the v y component agrees well with the various 
data sets, while the v x component appears to be systemat- 
ically offset. Note however, that there is overlap at the 3<r 



